A Discussion of: “Process Consistency for AdaBoost” by Wenxin Jiang; “On the Bayes-Risk Consistency of Regularized Boosting Methods” by Gábor Lugosi and Nicolas Vayatis; and “Statistical Behavior and Consistency of Classification Methods Based on Convex Risk Minimization” by Tong Zhang

Authors

  • Wenxin Jiang
  • Nicolas Vayatis
  • Tong Zhang
  • Yoav Freund
  • Robert E. Schapire
Abstract

The notion of a boosting algorithm was originally introduced by Valiant in the context of the “probably approximately correct” (PAC) model of learnability [19]. In this context boosting is a method for provably improving the accuracy of any “weak” classification learning algorithm. The first boosting algorithm was invented by Schapire [16] and the second one by Freund [2]. These two algorithms were introduced for a specific theoretical purpose. However, since the introduction of AdaBoost [5], quite a number of perspectives on boosting have emerged. For instance, AdaBoost can be understood as a method for maximizing the “margins” or “confidences” of the training examples [17]; as a technique for playing repeated matrix games [4, 6]; as a linear or convex programming method [15]; as a functional gradient-descent technique [8, 13, 14, 3]; as a technique for Bregman-distance optimization in a broader framework that includes logistic regression [1, 10, 12]; and finally as a stepwise model-fitting method for minimization of the exponential loss function, an approximation of the negative log binomial likelihood [7]. The current papers add to this list of perspectives, giving a view of boosting that is very different from its original interpretation and analysis as an algorithm for improving the accuracy of a weak learner. These many different points of view add to the richness of the theory of boosting, and are enormously helpful in the practical design of new or better algorithms for machine learning and statistical inference.

Originally, boosting algorithms were designed expressly for classification. The goal in this setting is to accurately predict the classification of a new example. Either the prediction is correct, or it is not. There is no attempt made to estimate the conditional probability of each class. In practice, this sometimes is not enough since we may want to have some sense of how likely our prediction is to be correct, or we may want to incorporate numbers that look like probabilities into a larger system. Later, Friedman, Hastie and Tibshirani [7] showed that AdaBoost can in fact be used to estimate such probabilities, arguing that AdaBoost approximates a form of logistic regression. They and others [1] subsequently modified AdaBoost to explicitly minimize the loss function associated with logistic regression, with the intention of computing such estimated probabilities. In one of the current papers, Zhang vastly generalizes this approach, showing that conditional probability estimates P{y|x} can be obtained when minimizing any smooth convex loss function ...
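As a concrete illustration of this last point (a sketch only, not quoted from the papers; F denotes the real-valued score built by boosting and y ∈ {−1, +1} the label), the population minimizer of AdaBoost's exponential loss already encodes the conditional class probability:

\[
F^*(x) = \arg\min_{F(x)} \mathbb{E}\!\left[ e^{-yF(x)} \mid x \right]
       = \frac{1}{2}\,\log\frac{P\{y=+1\mid x\}}{P\{y=-1\mid x\}},
\qquad
P\{y=+1\mid x\} = \frac{1}{1+e^{-2F^*(x)}} .
\]

Zhang's result shows that an analogous inversion is possible for any smooth convex loss, not just the exponential one.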


Similar Articles

Discussion of Boosting Papers

We congratulate the authors for their interesting papers on boosting and related topics. Jiang deals with the asymptotic consistency of AdaBoost. Lugosi and Vayatis study the convex optimization of loss functions associated with boosting. Zhang studies the loss functions themselves. Their results imply that boosting-like methods can reasonably be expected to converge to Bayes classifiers under ...


Process Consistency for AdaBoost

Recent experiments and theoretical studies show that AdaBoost can overfit in the limit of large time. If running the algorithm forever is suboptimal, a natural question is how low the prediction error can be during the process of AdaBoost. We show under general regularity conditions that during the process of AdaBoost a consistent prediction is generated, which has the prediction error approxima...
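Stated schematically (a paraphrase of the consistency claim, with notation introduced here for illustration rather than taken from the paper): writing \( \hat f_t \) for the AdaBoost combination after \( t \) rounds on a sample of size \( n \), and \( L^* \) for the Bayes error, the result asserts the existence of stopping times \( t_n \) such that

\[
L\big(\operatorname{sign}\hat f_{t_n}\big) \longrightarrow L^* \quad \text{in probability as } n \to \infty ,
\]

i.e. stopping the process at a suitable point yields prediction error approaching the Bayes risk.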


Boosting method for local learning

We propose a local boosting method for classification problems, borrowing from an idea of the local likelihood method. The proposed method includes a simple device for localization to ensure computational feasibility. We prove the Bayes-risk consistency of the local boosting in the framework of PAC learning. Inspection of the proof provides a useful viewpoint for comparing the ordinary boosting and the...


Bayes-risk Consistency of Boosting

The probability of error of classification methods based on convex combinations of simple base classifiers by “boosting” algorithms is investigated. We show in this talk that certain regularized boosting algorithms provide Bayes-risk consistent classifiers under the only assumption that the Bayes classifier may be approximated by a convex combination of the base classifiers. Non-asymptotic dist...
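In schematic form (notation assumed here for illustration, not quoted from the abstract): with a convex cost function \( \varphi \) and base class \( \mathcal{C} \), a regularized boosting estimator can be written as

\[
\hat f_n = \arg\min_{f \in \lambda_n \cdot \operatorname{conv}(\mathcal{C})} \; \frac{1}{n}\sum_{i=1}^{n} \varphi\big(-Y_i f(X_i)\big),
\]

where \( \lambda_n \) controls the size of the convex combination; Bayes-risk consistency then means that \( L(\operatorname{sign}\hat f_n) \to L^* \) whenever the Bayes classifier can be approximated by convex combinations of elements of \( \mathcal{C} \).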


On Weak Base Hypotheses and Their Implications

When studying the training error and the prediction error for boosting, it is often assumed that the hypotheses returned by the base learner are weakly accurate, or are able to beat a random guesser by a certain amount of difference. It has been an open question how large this difference can be, whether it will eventually disappear in the boosting process or be bounded by a finite amount; see,...
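For reference (a standard formulation of the weak-learning assumption, stated here only for illustration): the base hypothesis \( h_t \) chosen in round \( t \) is required to beat random guessing on the current example weighting \( D_t \) by some edge \( \gamma_t > 0 \),

\[
\epsilon_t \;=\; \Pr_{i \sim D_t}\big[\, h_t(x_i) \neq y_i \,\big] \;\le\; \tfrac{1}{2} - \gamma_t ,
\]

and the open question above concerns how large such edges can remain as the boosting process continues.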



Journal title:

Volume   Issue

Pages  -

Publication date: 2015